ABSTRACT


The main emphasis of this tutorial paper is on the formulation of
appropriate state-space models for Kalman filtering
applications. The so-called "model" is completely specified by
four matrix parameters and the initial conditions of the
recursive equations. Once these are determined, the die is cast,
and the way in which the measurements are weighted is determined
forever after. Thus, finding a model that fits the physical
situation at hand is all important. Also, it is often the most
difficult aspect of designing a Kalman filter. Formulation of
discrete state models from the spectral density and random
process descriptions is discussed. Finally, it is pointed out
that many common processes encountered in applied work (such as
band-limited white noise) simply do not lend themselves very well
to Kalman filter modeling.


INTRODUCTION 


Kalman filtering is now well known, and tutorial discussions of the
technique are given in a number of standard references [1,2,3]. The filter
recursive equations are summarized in Figure 1 for reference purposes here.
It should be noted that once the initial conditions and the $\phi_k$, $H_k$,
$R_k$, and $Q_k$ parameters are specified, the die is cast and the way in
which the measurement sequence is processed is completely determined. Thus,
the specification of these parameters is especially important; they are, in
effect, the filter "model". The emphasis in this tutorial paper will be on
the modeling aspect of Kalman filtering. To see where these parameters come
from, we will now review the basic process and measurement equations.
Figure 1 Kalman filter loop


THE DISCRETE PROCESS AND MEASUREMENT EQUATIONS 


The starting point for discrete Kalman filter theory is the pair of process
and measurement equations. The random process under consideration is
assumed to satisfy the following recursive equation

$$x_{k+1} = \phi_k x_k + w_k \tag{1}$$

where k refers to the k-th step in time, $x_k$ is a vector random process,
$\phi_k$ is the transition matrix, and $w_k$ is a Gaussian white sequence
with a covariance structure given by

$$E[w_k w_i^T] = \begin{cases} Q_k, & i = k \\ 0, & i \neq k \end{cases} \tag{2}$$

The measurement relationship is assumed to be of the form

$$z_k = H_k x_k + v_k \tag{3}$$

where $v_k$ is also a Gaussian white sequence, uncorrelated with $w_k$, and
described by the covariance

$$E[v_k v_i^T] = \begin{cases} R_k, & i = k \\ 0, & i \neq k \end{cases} \tag{4}$$


In words, then, the key parameters of a Kalman filter model can be described
as follows:

(1) $\phi_k$ is the transition matrix that describes the natural dynamics of
the process in going from step k to k+1.

(2) $H_k$ is the linear connection matrix that gives the ideal
(noiseless) relationship between the measurement and the
process to be estimated.

(3) $Q_k$ describes the additional noise that comes into the process
in the $\Delta t$ interval between step k and k+1.

(4) $R_k$ describes additive measurement noise.

It is important to note that the discrete model described by Eqs. (1)
through (4) stands in its own right. It is not an approximation of some
continuous system, nor does it have to be related to another continuous
linear dynamical system in any way. Once the discrete model is assumed, the
recursive estimation process given in Fig. 1 follows directly.
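
The way the four parameters drive the recursion can be sketched for the scalar case. The following Python fragment is only an illustration using the standard gain, update, and projection steps; it is not a transcription of Fig. 1, and the function names `kalman_step` and `gain_sequence` are assumptions made for this example.

```python
def kalman_step(x_prior, p_prior, z, phi, h, q, r):
    """One pass around the scalar Kalman filter loop.

    x_prior, p_prior : a priori estimate and its error variance at step k
    z                : measurement at step k
    phi, h, q, r     : the four model parameters (scalars here)
    Returns the updated estimate and variance at step k, followed by the
    projected (a priori) estimate and variance for step k+1.
    """
    gain = p_prior * h / (h * p_prior * h + r)     # Kalman gain
    x_post = x_prior + gain * (z - h * x_prior)    # blend in the measurement
    p_post = (1.0 - gain * h) * p_prior            # updated error variance
    x_next = phi * x_post                          # project ahead one step
    p_next = phi * p_post * phi + q
    return x_post, p_post, x_next, p_next


def gain_sequence(p0, phi, h, q, r, steps):
    """Gains for the first `steps` measurements.

    The measurement value passed to kalman_step is a dummy: the gains and
    variances never depend on the data, only on the model and on p0.
    """
    gains, p = [], p0
    for _ in range(steps):
        gains.append(p * h / (h * p * h + r))
        _, _, _, p = kalman_step(0.0, p, 0.0, phi, h, q, r)
    return gains
```

Calling `gain_sequence` with fixed parameters returns the same weights no matter what measurements later arrive, which is the sense in which the die is cast once the model is chosen.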

IMPORTANCE OF THE GAUSSIAN ASSUMPTION

We will digress for a moment and look at the Gaussian assumption used in
Eqs. (1) through (4). If $w_k$ and $v_k$ are Gaussian white sequences, then
$x_k$ and $z_k$ will be Gaussian processes. Even though the Gaussian
assumption is often omitted in discussions of least-squares filtering, we
make it here with no apology. The reason for this is that minimizing the
mean square error really does not make very good sense for non-Gaussian
processes. To illustrate this, consider the two processes shown in Fig. 2.
The first is a scalar Gauss-Markov process which has the general appearance
of typical noise. The second process is the random telegraph wave which
switches between +1 and -1 at random points in time. If the parameters of
the two processes are adjusted appropriately, they can be made to have
identical power spectral density functions. Yet, they are radically
different processes! The least-squares prediction far out into the future is
zero for both cases. This makes good sense in the Gauss-Markov case because
zero is the mean and most likely value. On the other hand, it is ridiculous
to predict zero in the random telegraph wave case. We know a priori that
this waveform is never zero. We would be better off to predict either +1 or
-1 and be correct half the time than to predict zero and be wrong all the
time! Thus, the Gaussian assumption is a reasonable one in the least squares
estimation theory, and to stray from it leads us into dangerous territory.
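
The contrast between the two processes can be reproduced numerically. The sketch below is an illustration only: it samples both processes at a fixed step, assumes unit variance, and chooses the telegraph switching probability so that both have the one-step correlation exp(-beta*dt). The function names and parameter values are assumptions made for the example.

```python
import math
import random


def simulate(beta, dt, n, seed=0):
    """Sample a unit-variance Gauss-Markov process and a random telegraph
    wave that share the autocorrelation exp(-beta * |tau|)."""
    rng = random.Random(seed)
    rho = math.exp(-beta * dt)            # one-step correlation of both
    flip_prob = 0.5 * (1.0 - rho)         # makes E[x(k) x(k+1)] = rho
    gm = [rng.gauss(0.0, 1.0)]
    tw = [rng.choice((-1.0, 1.0))]
    for _ in range(n - 1):
        gm.append(rho * gm[-1]
                  + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0))
        tw.append(-tw[-1] if rng.random() < flip_prob else tw[-1])
    return gm, tw


def lag1_correlation(x):
    """Sample correlation between adjacent samples (zero mean assumed)."""
    num = sum(a * b for a, b in zip(x[:-1], x[1:]))
    den = sum(a * a for a in x)
    return num / den
```

For a long run, `lag1_correlation` should come out close to exp(-beta*dt) for both sequences, yet every telegraph sample is exactly +1 or -1, so a prediction of zero is never correct for it.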





Figure 2 Gauss-Markov and random telegraph waves 


TRANSITION FROM A SPECTRAL DESCRIPTION TO A DISCRETE STATE MODEL 


In Kalman filter applications, we frequently begin with a spectral
description of the various random processes involved. The problem then is to
convert this information to a model of the form specified by Eqs. (1)
through (4). The general procedure for making the transition to the
discrete model is as follows:

(1) Look for a continuous dynamical system that yields the desired
process when driven by white noise. (The white noise input
assures that $w_k$ will be a white sequence.)

(2) Then write the dynamical equations in state-space form:

$$\dot{x} = Ax + Bu \tag{5}$$

(3) Solve the state equations for step size $\Delta t$ and obtain

$$x_{k+1} = \phi_k x_k + w_k \tag{6}$$

(4) Determine the measurement equation from the particular situation
at hand.
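
Step (3) can be carried out numerically for a constant A and B. The sketch below uses a matrix-exponential construction commonly attributed to Van Loan; that method is not described in this paper and is brought in here only as one possible way of obtaining $\phi_k$ and $Q_k$ from Eq. (5). The helper names, and the argument `bwbt` (the product B W B^T, with W the spectral density of the white input u), are assumptions made for this example.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_exp(m, terms=20, squarings=8):
    """Matrix exponential by a scaled Taylor series with repeated squaring."""
    n = len(m)
    scaled = [[v / 2.0 ** squarings for v in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def discretize(a, bwbt, dt):
    """Return (phi, q) for x_dot = A x + B u over a step of length dt.

    Builds M = [[-A, B W B^T], [0, A^T]] * dt. The lower-right block of
    exp(M) is phi^T and the upper-right block is phi^(-1) Q.
    """
    n = len(a)
    m = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            m[i][j] = -a[i][j] * dt
            m[i][n + j] = bwbt[i][j] * dt
            m[n + i][n + j] = a[j][i] * dt
    e = mat_exp(m)
    phi = [[e[n + j][n + i] for j in range(n)] for i in range(n)]
    upper_right = [[e[i][n + j] for j in range(n)] for i in range(n)]
    return phi, mat_mul(phi, upper_right)
```

For example, `discretize([[0.0, 1.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]], dt)` treats a double integrator driven by unity white noise. For large models a library matrix exponential would normally replace the simple series used here.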


To illustrate the procedure further, suppose the y process power spectral
density function $S_y(s)$ can be written as a ratio of polynomials in $s^2$
(or $\omega^2$, where $\omega^2 = -s^2$). The spectral function can then
always be factored into two symmetric parts, one with its poles and zeros in
the left-half s plane, the other with mirror-image poles and zeros in the
right-half plane. This is called spectral factorization and is represented
mathematically as

$$S_y(s) = S_y^+(s)\, S_y^-(s) \tag{7}$$

where $S_y^+$ and $S_y^-$ are the left- and right-half plane parts
respectively. $S_y^+(s)$ then becomes the shaping filter that will shape
unity white noise into a process y(t) with a spectral function $S_y(s)$.
(See Ref. [1] for further details.)


Now suppose that the shaping filter is of the form shown in Fig. 3. We seek
a state-space model for that dynamical system. One way of achieving this is
shown in block diagram form in Fig. 4.

Figure 3 Shaping filter

Figure 4 Shaping filter redrawn

The state-space equations are then










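$$\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \vdots \\ \dot{x}_n \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & \cdots & -a_{n-1} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} u(t) \tag{8}$$

$$y = \begin{bmatrix} b_0 & b_1 & \cdots & b_{n-1} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \tag{9}$$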


Control system engineers refer to this as the controllable canonical form, 
and it can always be achieved for the dynamical system as shown in Fig. 3. 
If y is the process that is actually measured, then the H matrix is just the 
row matrix of b's given in Eq. (9).
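
A small routine can assemble the controllable canonical form from the filter coefficients. The sketch below assumes the shaping filter is N(s)/D(s) with D(s) = s^n + a[n-1] s^(n-1) + ... + a[0] and N(s) = b[n-1] s^(n-1) + ... + b[0]; the names of the denominator coefficients and of the functions are assumptions made for this example, since Fig. 3 is not reproduced here.

```python
def controllable_canonical(a, b):
    """State-space matrices (A, B, H) for G(s) = N(s)/D(s), with
    x_dot = A x + B u and y = H x.

    a : denominator coefficients [a0, a1, ..., a(n-1)] (monic D(s))
    b : numerator coefficients   [b0, b1, ...], at most n of them
    """
    n = len(a)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1.0                  # chain of integrators
    A[n - 1] = [-ai for ai in a]           # feedback terms in the last row
    B = [0.0] * (n - 1) + [1.0]            # input drives the last state only
    H = list(b) + [0.0] * (n - len(b))     # row of b's, padded with zeros
    return A, B, H


def transfer_function(A, B, H, s):
    """Evaluate H (sI - A)^(-1) B at a complex frequency s.

    For this canonical form (sI - A) v = B is solved by
    v = [1, s, ..., s^(n-1)] / D(s); the loop verifies that residual.
    """
    n = len(A)
    d = s ** n - sum(A[n - 1][i] * s ** i for i in range(n))   # D(s)
    v = [s ** i / d for i in range(n)]
    for i in range(n):
        row = sum(((s if i == j else 0.0) - A[i][j]) * v[j]
                  for j in range(n))
        assert abs(row - B[i]) < 1e-9
    return sum(H[i] * v[i] for i in range(n))
```

Evaluating `transfer_function` at a few values of s and comparing with N(s)/D(s) is a quick check that the coefficients were entered in the intended order.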

EXAMPLE 


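Suppose we have a scalar Gauss-Markov process y(t) whose power spectral
density function is

$$S_y(s) = \frac{2\sigma^2\beta}{-s^2 + \beta^2} \quad \left(\text{or } S_y(j\omega) = \frac{2\sigma^2\beta}{\omega^2 + \beta^2}\right) \tag{10}$$

We first factor $S_y(s)$ as follows:

$$S_y(s) = \frac{\sqrt{2\sigma^2\beta}}{s + \beta} \cdot \frac{\sqrt{2\sigma^2\beta}}{-s + \beta} \tag{11}$$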
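
The factorization can be checked numerically by evaluating both sides on the imaginary axis. The short sketch below does this for the first-order spectrum of Eq. (10); the function names and the parameter values a reader would pass in are illustrative assumptions.

```python
import math


def psd(omega, sigma2, beta):
    """Gauss-Markov power spectral density of Eq. (10) at s = j*omega."""
    return 2.0 * sigma2 * beta / (omega ** 2 + beta ** 2)


def shaping_filter(s, sigma2, beta):
    """Left-half-plane factor of Eq. (11): sqrt(2*sigma^2*beta)/(s + beta)."""
    return math.sqrt(2.0 * sigma2 * beta) / (s + beta)


def factorization_error(omega, sigma2, beta):
    """|S+(s) S-(s) - S(s)| at s = j*omega, where S-(s) = S+(-s)."""
    s = 1j * omega
    product = (shaping_filter(s, sigma2, beta)
               * shaping_filter(-s, sigma2, beta))
    return abs(product - psd(omega, sigma2, beta))
```

The error should be at rounding level for any real frequency, which confirms that the left-half-plane factor alone carries the whole spectral shape.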


The shaping filter is then $\sqrt{2\sigma^2\beta}/(s+\beta)$, which
corresponds to the dynamical equation

$$\dot{y} + \beta y = \sqrt{2\sigma^2\beta}\, w(t) \tag{12}$$

This is a simple first order differential equation, so we only have one
state variable. Call it $x_1$. Our state equation is then

$$\dot{x}_1 = -\beta x_1 + \sqrt{2\sigma^2\beta}\, w(t) \tag{13}$$

The solution of this equation for a step size $\Delta t$ is

$$x_1(k+1) = e^{-\beta \Delta t} x_1(k) + w_k \tag{14}$$

and $e^{-\beta \Delta t}$ can be seen to be the transition matrix $\phi_k$.
The mean square value of $w_k$ can be determined from random process theory
[1], and it works out to be

$$E[w_k^2] = Q_k = \sigma^2 \left(1 - e^{-2\beta \Delta t}\right) \tag{15}$$

The process model is now complete.
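
As a quick consistency check on the example, the discrete model should reproduce the variance of the continuous process. The sketch below assumes the standard first-order results phi = exp(-beta*dt) and Q = sigma^2 (1 - exp(-2*beta*dt)), and iterates the variance recursion P <- phi P phi + Q; the function names are assumptions made for this example.

```python
import math


def gauss_markov_discrete(sigma2, beta, dt):
    """Discrete parameters (phi, q) of the scalar Gauss-Markov model."""
    phi = math.exp(-beta * dt)
    q = sigma2 * (1.0 - math.exp(-2.0 * beta * dt))
    return phi, q


def steady_state_variance(phi, q, steps=10000):
    """Iterate P <- phi*P*phi + q from P = 0 and return the final value."""
    p = 0.0
    for _ in range(steps):
        p = phi * p * phi + q
    return p
```

With any positive `beta` and `dt`, the recursion settles at `sigma2`, the variance implied by the spectral density, regardless of the step size chosen.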

UNIQUENESS 

We might pose a question at this point:

Are Kalman filter models unique?

The answer is an emphatic NO. We know from linear system theory that any
nonsingular linear transformation on the state vector leads to another
equally legitimate state vector. The choice of coordinate frame for
performing the estimation process is purely a matter of convenience.
Optimal estimates can be transformed freely from one coordinate frame to
another (through a linear transformation) and still remain optimal estimates
in the new frame of reference.
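
The effect of a change of state coordinates can be illustrated with a two-state model. The sketch below applies a nonsingular transformation x' = T x to the model parameters and compares the variance of the noiseless measurement H x predicted in each frame. It demonstrates only that the two models describe the same measured process, not the full filter; the function names and any numerical values used with them are assumptions made for this example.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse_2x2(t):
    det = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    return [[t[1][1] / det, -t[0][1] / det],
            [-t[1][0] / det, t[0][0] / det]]


def transform_model(phi, h, q, p0, t):
    """Re-express a two-state model in the new state x' = T x."""
    t_inv = inverse_2x2(t)
    phi_new = mat_mul(mat_mul(t, phi), t_inv)      # T phi T^-1
    h_new = mat_mul(h, t_inv)                      # H T^-1
    q_new = mat_mul(mat_mul(t, q), transpose(t))   # T Q T^T
    p0_new = mat_mul(mat_mul(t, p0), transpose(t)) # T P0 T^T
    return phi_new, h_new, q_new, p0_new


def measurement_variances(phi, h, q, p0, steps):
    """Propagate P <- phi P phi^T + Q and record H P H^T at each step."""
    p, out = p0, []
    for _ in range(steps):
        out.append(mat_mul(mat_mul(h, p), transpose(h))[0][0])
        p = mat_mul(mat_mul(phi, p), transpose(phi))
        p = [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]
    return out
```

Running `measurement_variances` on a model and on its transformed version gives the same sequence to within rounding, whatever nonsingular T is chosen.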

ARMA MODEL

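Sometimes the random process model comes to us in the form of a difference
equation rather than a continuous differential equation. For example,
consider the auto-regressive moving average (ARMA) model that relates a
discrete process y(k) to an input white sequence u(k).

$$y(k+n) + \alpha_{n-1} y(k+n-1) + \cdots + \alpha_0 y(k) = \beta_{n-1} u(k+n-1) + \cdots + \beta_0 u(k) \tag{16}$$

There is a close analogy between difference and differential equations, and
it works out that this nth-order difference equation can be converted to
vector form in much the same manner as for a differential equation. If we
define an intermediate variable $y'(k)$ as the solution to Eq. (16) with just
u(k) as the driving function, and then define our state variables as

$$x_1(k) = y'(k), \quad x_2(k) = y'(k+1), \quad \text{etc.} \tag{17}$$

then the system of Eq. (16) translates into state-space form as

$$\begin{bmatrix} x_1(k+1) \\ x_2(k+1) \\ \vdots \\ x_n(k+1) \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -\alpha_0 & -\alpha_1 & -\alpha_2 & \cdots & -\alpha_{n-1} \end{bmatrix} \begin{bmatrix} x_1(k) \\ x_2(k) \\ \vdots \\ x_n(k) \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} u(k) \tag{18}$$

$$y(k) = \begin{bmatrix} \beta_0 & \beta_1 & \cdots & \beta_{n-1} \end{bmatrix} \begin{bmatrix} x_1(k) \\ x_2(k) \\ \vdots \\ x_n(k) \end{bmatrix} \tag{19}$$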

Note that our choice of state variables leads to the controllable canonical
form, just as in the continuous dynamical case. Of course, we could have
defined our state variables differently and arrived at a form different from
Eqs. (18) and (19). We will not pursue this further other than to say the
choice of state variables is (within limits) a matter of convenience for the
situation at hand.
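
The equivalence between the difference equation and its state-space form can be checked by running both on the same input. The sketch below assumes the coefficient ordering alpha = [alpha_0, ..., alpha_(n-1)] and beta = [beta_0, ..., beta_(n-1)], zero initial conditions, and an input that is zero before k = 0; the function names are assumptions made for this example.

```python
def arma_direct(alpha, beta, u):
    """Direct recursion of the nth-order ARMA difference equation,
    with y and u taken as zero before k = 0."""
    n = len(alpha)
    y = []
    for j in range(len(u)):
        total = 0.0
        for i in range(n):
            idx = j - n + i          # sample y(k+i), u(k+i) with k = j - n
            if idx >= 0:
                total += beta[i] * u[idx] - alpha[i] * y[idx]
        y.append(total)
    return y


def arma_state_space(alpha, beta, u):
    """Same process via the controllable canonical form, zero initial state."""
    n = len(alpha)
    x = [0.0] * n
    y = []
    for uk in u:
        # Output row: y(k) = [beta_0 ... beta_(n-1)] x(k)
        y.append(sum(beta[i] * x[i] for i in range(n)))
        # Last row of the companion matrix plus the input
        x_last = uk - sum(alpha[i] * x[i] for i in range(n))
        # Shift rows: x_i(k+1) = x_(i+1)(k)
        x = x[1:] + [x_last]
    return y
```

Feeding any input sequence to both functions should give the same output to within rounding, which is a convenient test when entering coefficients for a higher-order model.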


PROCESSES DERIVED FROM IRRATIONAL SHAPING FILTERS 

The random process modeling procedures discussed thus far have been
straightforward. They may be tedious for higher-order processes, but they
do not call for much imagination. There exists, however, a whole class of
processes where this is not the case. These are the processes that cannot
be thought of as the result of passing vector white noise through a linear
dynamical system of finite order. Such processes are commonplace in
engineering literature. For example, bandlimited Gaussian white noise is a
very useful abstraction in communication theory. It is Gaussian noise that
has a flat spectrum in the baseband and then is zero out beyond the cutoff
frequency. It can be thought of as the result of passing pure white noise
through an idealized lowpass filter, but no such filter can be represented
as a ratio of polynomials in s of finite order. (Note that a Butterworth
filter can be made to approximate the ideal case, but not equal it.) The
idealizations of bandlimited white noise are often a convenience in
communication theory; however, they are an obstruction in Kalman filter
theory.




There is a theorem from linear systems theory that is useful at this point.
Chen [4] gives us the following criterion for the realization of linear
dynamical models.

A linear dynamical model of the form

$$\dot{x} = Ax + Bu, \qquad y = Cx + Du \tag{20}$$

will exist for a system with an input-output impulsive response
$G(t,\tau)$ if and only if $G(t,\tau)$ is factorable in the form

$$G(t,\tau) = M(t)N(\tau) \tag{21}$$


M and N are finite-order matrices, so if $G(t,\tau)$ is scalar (i.e.,
single-input, single-output), $M(t)$ is a row vector and $N(\tau)$ is a
column vector. This theorem can then be used as a test to see if a dynamical
system will exist for a corresponding impulsive response function.
Furthermore, the factorization provides the necessary information for
realization of the model. (See Chen [4] for further details.) We will use
flicker noise to illustrate the use of Chen's theorem. Flicker noise is of
special interest to the PTTI community because of its presence in precision
frequency standards. It is characterized by a power spectral density
function of the form of $1/f$ at the frequency level, or $1/f^3$ when
referred to the phase level [5,6]. A block diagram showing the relationship
between flicker noise and white noise is given in Fig. 5.
Figure 5 Block diagrams relating flicker noise to white noise


Clearly, the transfer function relating input white noise to the output
phase x(t) is $1/s^{3/2}$. The inverse transform of $1/s^{3/2}$ gives the
impulsive response to an impulse applied at $t = 0$. This is
$2\sqrt{t/\pi}$. Thus, for an impulse applied at $t = \tau$, we have (in
Chen's notation)

$$G(t,\tau) = \begin{cases} 2\sqrt{(t-\tau)/\pi}, & t \ge \tau \\ 0, & t < \tau \end{cases} \tag{22}$$

The question is, "Is $G(t,\tau)$ factorable in the form $M(t)N(\tau)$?" It
appears that it is not, although this is difficult to show in a rigorous
sense. This being the case, Chen's theorem says that no linear dynamical
system will exist that corresponds to the $G(t,\tau)$ of Eq. (22). This is
to say that no finite-order state model will exactly represent flicker
noise! Of course, the state model is essential for Kalman filtering, so this
leads to a dilemma when one attempts to include flicker noise in a Kalman
filter clock model. This is the subject of a companion paper in these
Proceedings [6], so we will not pursue this further here.
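
A simple numerical experiment hints at the difficulty, although it proves nothing in general. If a scalar kernel factors as a product m(t) n(tau) of two scalar functions, every 2x2 determinant of its samples vanishes. The sketch below applies that first-order test to the flicker kernel and, for contrast, to the impulse response of 1/(s + beta), which does factor. It only rules out a first-order factorization at the sampled points, not every finite order; the function names and sample points are assumptions made for this example.

```python
import math


def flicker_kernel(t, tau):
    """Impulse response of 1/s^(3/2): 2*sqrt((t - tau)/pi) for t >= tau."""
    return 2.0 * math.sqrt((t - tau) / math.pi) if t >= tau else 0.0


def exponential_kernel(t, tau, beta=1.0):
    """Impulse response of 1/(s + beta) for t >= tau.

    This one factors as exp(-beta*t) * exp(beta*tau).
    """
    return math.exp(-beta * (t - tau)) if t >= tau else 0.0


def rank_one_defect(kernel, t1, t2, tau1, tau2):
    """2x2 determinant of kernel samples; it is zero whenever the kernel
    has the scalar product form m(t) * n(tau) at the sampled points."""
    return (kernel(t1, tau1) * kernel(t2, tau2)
            - kernel(t1, tau2) * kernel(t2, tau1))
```

With observation times later than both impulse times, for instance `rank_one_defect(flicker_kernel, 3.0, 4.0, 1.0, 2.0)`, the flicker kernel gives a clearly nonzero determinant while the exponential kernel gives zero to within rounding.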

SUMMARY 

Various aspects of Kalman filtering modeling have been discussed briefly in
this paper. Perhaps the most important thing to remember is that the random
processes under consideration must be modeled in vector state-space form.
This can often be done with exact methods. If the exact methods discussed
here cannot be used, as in the case of flicker noise, then one must seek
approximate finite-order vector models in order to form a workable Kalman
filter. The measurement model usually does not cause difficulty, because it
simply depends on what state variables are being observed.
QUESTIONS AND ANSWERS


VICTOR REINHARDT, HUGHES AIRCRAFT COMPANY: I think you are right 
about that not being able to be factored, and I think that I have 
a reason for that. You can show that flicker noise can be 
mathematically generated by the sum of an infinite number of 
Gaussian processes where the beta term goes from zero to
infinity. Therefore, there are infinite time constants in the 
process. So, you can't give a state vector at any one time, 
because the beta term goes from zero to infinity. 

MR. BROWN: I agree with what you say. I think that it fits my
intuition to think the same thing, and I have read that paper 
that you wrote on it. I think that it's a very nice paper, and a 
nice way to look at it. 

Other people have also approximated flicker noise with a 
cascaded sequence of what we, in control system engineering, call 
lead or lag networks, which gives kind of a staircase sort of 
frequency response function, which, to a certain degree of 
approximation, drops off at ten dB per decade rather than twenty 
dB. 

If you take any rational transfer function, or one that is
written out in integer powers, and look at the Bode plot, the 
slopes go in multiples of twenty dB per decade. There are no 
thirty dB per decade, or fifty dB per decade slopes. 

In the case of flicker noise, consider the filter that
shapes white noise into flicker noise. It requires an s to the
negative one-half power in the transfer function. That would give
a Bode plot that drops off at ten dB per decade instead of
twenty. What you would do is approximate that ten dB per decade
slope with a whole sequence of filters with alternating zeros and
poles. You then end up with a staircase shape response which, on
the average, has a ten dB per decade slope.

Incidentally, I think that this is a very good way to model
flicker noise. The difficulty is that every time you put a new
pole in the system you have a new state model. If you want to get a
reasonably accurate approximation of flicker noise that way, it
would involve escalating the order of the Kalman filter
considerably. There is nothing wrong with doing it off-line for
analysis purposes. I think that there are some on-line cases
where it would not be accepted.
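
The staircase approximation described above can be sketched numerically. The fragment below is an illustration only: it assumes one pole per decade, with each zero placed half way (on a log scale) between adjacent poles, and evaluates the gain of the cascade. The pole spacing, the starting frequency, and the function names are assumptions made for this example, not values taken from the discussion.

```python
import math


def cascade_gain_db(omega, stages, first_pole=1.0, ratio=10.0):
    """Gain in dB at frequency omega of a cascade of lag sections.

    Section i is (1 + s/zero_i) / (1 + s/pole_i), with its pole at
    first_pole * ratio**i and its zero at the pole times sqrt(ratio).
    """
    gain = 0.0
    for i in range(stages):
        pole = first_pole * ratio ** i
        zero = pole * math.sqrt(ratio)
        gain += 10.0 * math.log10(1.0 + (omega / zero) ** 2)
        gain -= 10.0 * math.log10(1.0 + (omega / pole) ** 2)
    return gain


def average_slope_db_per_decade(stages, omega_low, omega_high):
    """Mean slope of the staircase response between two frequencies."""
    decades = math.log10(omega_high / omega_low)
    return (cascade_gain_db(omega_high, stages)
            - cascade_gain_db(omega_low, stages)) / decades
```

With four stages, the slope measured between two frequencies a whole number of decades apart inside the band spanned by the poles comes out close to ten dB per decade, and each added stage adds one state to the model.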

MR. REINHARDT: I think that some people have reported on a
similar method where they used a finite number of filters and it
worked very well in an operational case. If you try to limit that
process though, what happens is that all the poles run together,
and you end up with a branch line.

MR. BROWN: I guess my answer to that would be that, in any of 
these processes, in the case of flicker noise for example, at 
zero frequency and out at infinity, there are singular conditions 
for either case. If it drops off as one over f, the area under 
the curve out at infinity is not finite. You are talking about a 
process with infinite variance, which is physically ridiculous. 

The same thing happens at the other end of the spectrum, the
area under the curve doesn't converge there, either. Physically
it makes sense, if you want to be careful and talk about
processes of finite variance, that you have to bound the power
spectral density at the low frequency end and at the high
frequency end. It has to roll off at least twenty dB per decade
in order to have a process of finite variance.

It doesn't bother me to think of putting in a filter at the 
origin which will bound the frequency content at zero frequency, 
and also put one in at the high end and make it roll off at least 
twenty dB per decade. 

Incidentally, that impulse response function is not original
with me. Other people have written about that before, including
yourself, I think.

JIM BARNES, AUSTRON, INC.: I have done a fair amount of
simulation of flicker noise with polynomials, the lead-lag
networks you mentioned, and have one comment in their defense:
Three or four stages can do an amazing amount. You can cover as
much as three to four decades of frequency with only three or
four stages.

MR. BROWN: Oh, is that right? It isn't as bad as it might appear
at first glance then. I haven't used it, but would have imagined
that you would need a fairly large number.

MR. REINHARDT: As another comment, even a single filter, which
generates a random telegraph, will generate a flat Allan variance
of about two orders of magnitude in tau, right around the peak.
Then you really have to put a pole every order of magnitude or
even every two orders of magnitude.

MR. BROWN: All of these are, of course, approximate models for
the reasons which I just cited.

MR. ALLAN: I think, in practice, the problem with flicker noise
is not a serious one, because it's only at the extremes, as you
pointed out, at zero and at infinity that you have difficulties
with one over f integration. In practice, that's not where the
Fourier frequencies are. In reality, a few stages of the filter
will work very nicely in describing, predicting or simulating a
flicker process.

MR. BROWN: You need something like that though as far as the
Kalman filter is concerned. You can't afford to have these
fractional powers of s if you are going to do the state model.
You have to have something where you only need to worry about
integer powers of s, and if you can do that by only adding two or
three poles, that would be a very feasible way to approximate it
certainly.